channel (MAC) with side information at the sources and the decoder. Source-channel separation does not hold 
for this channel. Sufficient conditions are provided for transmission of sources with a given distortion. The source 
and/or the channel could have continuous alphabets (thus Gaussian sources and Gaussian MACs are special cases). 
Various previous results are obtained as special cases. We also provide several good joint source-channel coding 
schemes for discrete sources and discrete/continuous alphabet channels. 
I. Introduction and Survey 

In this paper we consider the transmission of information from several correlated sources over a multiple 
access channel with side information. This system does not satisfy source-channel separation ([12]). Thus 
for optimal transmission one needs to consider joint source-channel coding. We will provide several good 
joint source-channel coding schemes. 

Although this topic has been studied for the last several decades, one recent motivation is the problem 
of estimating a random field via sensor networks. Sensor nodes have limited computational and storage 
capabilities and very limited energy [3]. These sensor nodes need to transmit their observations to a 
fusion center which uses this data to estimate the sensed random field. Since transmission is very energy 
intensive, it is important to minimize it. 
The proximity of the sensing nodes to each other induces high correlations between the observations of 
adjacent sensors. One can exploit these correlations to compress the transmitted data significantly ([3], [4]). 
Furthermore, some of the nodes can be more powerful and act as cluster heads ([4]). Nodes transmit their 
data to a nearby cluster head which can further compress information before transmission to the fusion 
center. Transmission of data from sensor nodes to their cluster-head requires sharing the wireless multiple 
access channel (MAC). At the fusion center the underlying physical process is estimated. The main trade- 
off possible is between the rates at which the sensors send their observations and the distortion incurred in 
the estimation at the fusion center. The availability of side information at the encoders and/or the decoder 
can reduce the rate of transmission ([19], [42]). 

The above considerations open up interesting new problems in multi-user information theory, and the 
quest for the optimal performance for various models of sources, channels and side information 
has made this an active area of research. The optimal solution is not known except in a few simple 
cases. In this paper a joint source channel coding approach is discussed under various assumptions on side 
information and distortion criteria. Sufficient conditions for transmission of discrete/continuous alphabet 
sources with a given distortion over a discrete/continuous alphabet MAC are provided. These results 
generalize the previous results available on this problem. 

In the following we survey the related literature. Ahlswede [1] and Liao [28] obtained the capacity region 
of a discrete memoryless MAC with independent inputs. Cover, El Gamal and Salehi [12] made further 
significant progress by providing sufficient conditions for transmitting losslessly correlated observations 
over a MAC. They proposed a 'correlation preserving' scheme for transmitting the sources. This mapping 
is extended to a more general system with several principal sources and several side information sources 
subject to cross observations at the encoders in [2]. However, a single-letter characterization of the capacity 
region is still unknown. Indeed Dueck [15] proved that the conditions given in [12] are only sufficient and 
may not be necessary. In [26] a finite letter upper bound for the problem is obtained. It is also shown in 
[12] that the source-channel separation does not hold in this case. The authors of [35] obtain a condition 
for separation to hold in a multiple access channel. 

The capacity region for the distributed lossless source coding problem for correlated sources is given in 
the classic paper by Slepian and Wolf ([38]). Cover ([11]) extended Slepian-Wolf results to an arbitrary 
number of discrete, ergodic sources using a technique called 'random binning'. Other related papers on 
this problem are [2], [6]. 



Inspired by Slepian-Wolf results, Wyner and Ziv [42] obtained the rate distortion function for source 
coding with side information at the decoder. It is shown that the knowledge of side information at the 
encoders in addition to the decoder, permits the transmission at a lower rate. This is in contrast to the 
lossless case considered by Slepian and Wolf. The rate distortion function when encoder and decoder both 
have side information was first obtained by Gray (see [8]). Related work on side information coding is 
[5], [14], [33]. The lossy version of the Slepian-Wolf problem is called the multi-terminal source coding problem 
and despite numerous attempts (e.g., [9], [30]) the exact rate region is not known except for a few special 
cases. The first major advance was by Berger and Tung ([8]), where an inner and an outer bound on the 
rate distortion region were obtained. Lossy coding of continuous sources at the high resolution limit is 
studied in [43] where an explicit single-letter bound is obtained. Gastpar ([19]) derived an inner and an 
outer bound with decoder side information and proved the tightness of his bounds when the sources are 
conditionally independent given the side information. The authors in [39] obtain inner and outer bounds 
on the rate region with side information at the encoders and the decoder. In [29] an achievable rate region 
for a MAC with correlated sources and feedback is given. 

The distributed Gaussian source coding problem is discussed in [30], [41]. For two users the exact rate 
region is provided in [41]. The capacity of a Gaussian MAC (GMAC) for independent sources with 
feedback is given in [32]. In [27] one necessary and two sufficient conditions for transmitting a bivariate 
jointly Gaussian source over a GMAC are provided. It is shown that the amplify and forward scheme is 
optimal below a certain SNR. The performance comparison of the schemes given in [27] with a separation- 
based scheme is given in [34]. GMAC under received power constraints is studied in [18] and it is shown 
that the source-channel separation holds in this case. 

In [20] the authors discuss a joint source channel coding scheme over a MAC and show the scaling 
behavior for the Gaussian channel. A Gaussian sensor network in distributed and collaborative setting is 
studied in [24]. The authors show that it is better to compress the local estimates than to compress the raw 
data. The scaling laws for a many-to-one data-gathering channel are discussed in [17]. It is shown that 
the transport capacity of the network scales as $O(\log N)$ when the number of sensors $N$ grows to infinity 
and the total average power remains fixed. The scaling laws for the problem without side information are 
also discussed in [21] and it is shown that separating source coding from channel coding may require 
exponential growth, as a function of the number of sensors, in communication bandwidth. A lower bound 
on the best achievable distortion as a function of the number of sensors, total transmit power, the degrees of 
freedom of the underlying process and the spatio-temporal communication bandwidth is given. 

The joint source-channel coding problem is also related to the CEO problem [10]. In 
this problem, multiple encoders observe different, noisy versions of a single information source and 
communicate it to a single decoder called the CEO which is required to reconstruct the source within a 
certain distortion. The Gaussian version of the CEO problem is studied in [31]. 

This paper makes the following contributions. It obtains sufficient conditions for transmission of 
correlated sources with given distortions over a MAC with side information. The source/channel alphabets 
can be discrete or continuous. The sufficient conditions are strong enough that previously known results 
are special cases. Next we obtain a bit-to-Gaussian mapping which provides correlated Gaussian channel 
codewords for discrete distributed sources. 

The paper is organized as follows. Sufficient conditions for transmission of distributed sources over 
a MAC with side information and given distortion are obtained in Section II. The sources and the 
channel alphabets can be continuous or discrete. Several previous results are recovered as special cases 
in Section III. Section IV considers the important case of transmission of discrete correlated sources over 
a GMAC and presents a new joint source-channel coding scheme. Section V briefly considers Gaussian 
sources over a GMAC. Section VI concludes the paper. The proof of the main theorem is given in 
Appendix A. The proofs of several other results are provided in later appendices. 



II. Transmission of correlated sources over a MAC 

We consider the transmission of memoryless dependent sources, through a memoryless multiple access 
channel (Fig. 1). The sources and/or the channel input/output alphabets can be discrete or continuous. 
Furthermore, side information about the transmitted information may be available at the encoders and the 
decoder. Thus our system is very general and covers many systems studied earlier. 

Fig. 1. Transmission of correlated sources over a MAC with side information. 



Initially we consider two sources $(U_1, U_2)$ and side information random variables $Z_1, Z_2, Z$ with a known 
joint distribution $F(u_1, u_2, z_1, z_2, z)$. Side information $Z_i$ is available to encoder $i$, $i = 1, 2$ and the decoder 
has side information $Z$. The random vector sequence $\{(U_{1n}, U_{2n}, Z_{1n}, Z_{2n}, Z_n), n \ge 1\}$ formed from the 
source outputs and the side information with distribution $F$ is independent identically distributed (iid) in 
time. We will denote $\{U_{1k} : k = 1, \ldots, n\}$ by $U_1^n$. Similarly for other sequences. The sources transmit their 
codewords $X_{in}$'s to a single decoder through a memoryless multiple access channel. The channel output 
$Y$ has distribution $p(y | x_1, x_2)$ if $x_1$ and $x_2$ are transmitted at that time. Thus, $\{Y_n\}$ and $\{X_{1n}, X_{2n}\}$ satisfy 
$p(y_k | y^{k-1}, x_1^k, x_2^k) = p(y_k | x_{1k}, x_{2k})$. The decoder receives $Y_n$ and also has access to the side information 
$Z_n$. The encoders at the two users do not communicate with each other except via the side information. 
The decoder uses the channel outputs and its side information to estimate the sensor observations $U_{in}$ 
as $\hat{U}_{in}$, $i = 1, 2$. It is of interest to find encoders and a decoder such that $\{U_{1n}, U_{2n}, n \ge 1\}$ can be 
transmitted over the given MAC with $E[d_1(U_1, \hat{U}_1)] \le D_1$ and $E[d_2(U_2, \hat{U}_2)] \le D_2$ where $d_i$ are 
non-negative distortion measures and $D_i$ are the given distortion constraints. If the distortion measures are 
unbounded we assume that there exist $u_i^*$ such that $E[d_i(U_i, u_i^*)] < \infty$, $i = 1, 2$. This covers the important 
special case of mean square error (MSE) if $E[U_i^2] < \infty$, $i = 1, 2$. 

Source channel separation does not hold in this case. 

For discrete sources a common distortion measure is Hamming distance, 

$$d(x, x') = \begin{cases} 1, & \text{if } x \neq x', \\ 0, & \text{if } x = x'. \end{cases}$$

For continuous alphabet sources the most common distortion measure is $d(x, x') = (x - x')^2$. To obtain 
the results for the lossless case from our Theorem 1 below, we assume that $d_i(x, x') = 0 \Leftrightarrow x = x'$, e.g., 
Hamming distance. 
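The two distortion measures above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and `average_distortion` computes an empirical per-letter average over one realization, whereas the conditions in this paper are stated in terms of expectations.

```python
def hamming(x, x_hat):
    """Hamming distortion: 1 if the symbols differ, 0 otherwise."""
    return 0 if x == x_hat else 1


def squared_error(x, x_hat):
    """Squared-error distortion d(x, x') = (x - x')^2."""
    return (x - x_hat) ** 2


def average_distortion(source, estimate, d):
    """Empirical per-letter distortion (1/n) * sum_j d(u_j, u_hat_j)."""
    if len(source) != len(estimate) or len(source) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(d(u, v) for u, v in zip(source, estimate)) / len(source)


# One symbol out of four differs.
hamming_example = average_distortion([0, 1, 1, 0], [0, 1, 0, 0], hamming)

# Squared errors are 0.25 and 0.0, averaged over two samples.
mse_example = average_distortion([1.0, 2.0], [1.5, 2.0], squared_error)
```
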

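Definition: The source $(U_1^n, U_2^n)$ can be transmitted over the multiple access channel with distortions 
$D = (D_1, D_2)$ if for any $\epsilon > 0$ there is an $n_0$ such that for all $n > n_0$ there exist encoders 
$f_{E,i}^n : \mathcal{U}_i^n \times \mathcal{Z}_i^n \to \mathcal{X}_i^n$, $i = 1, 2$ and a decoder $f_D^n : \mathcal{Y}^n \times \mathcal{Z}^n \to (\hat{\mathcal{U}}_1^n, \hat{\mathcal{U}}_2^n)$ such that 

$$\frac{1}{n} E\left[\sum_{j=1}^{n} d(U_{ij}, \hat{U}_{ij})\right] \le D_i + \epsilon, \quad i = 1, 2,$$

where $(\hat{U}_1^n, \hat{U}_2^n) = f_D(Y^n, Z^n)$ and $\mathcal{U}_i, \mathcal{Z}_i, \mathcal{Z}, \mathcal{X}_i, \mathcal{Y}, \hat{\mathcal{U}}_i$ are the sets in which $U_i, Z_i, Z, X_i, Y, \hat{U}_i$ 
take values. 

We denote the joint distribution of $(U_1, U_2)$ by $p(u_1, u_2)$. Also, $X \leftrightarrow Y \leftrightarrow Z$ will denote that $\{X, Y, Z\}$ 
form a Markov chain. 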



Now we state the main Theorem. 

Theorem 1: A source can be transmitted over the multiple access channel with distortions $(D_1, D_2)$ if 
there exist random variables $(W_1, W_2, X_1, X_2)$ such that 

(1) $p(u_1, u_2, z_1, z_2, z, w_1, w_2, x_1, x_2, y) = p(u_1, u_2, z_1, z_2, z) p(w_1 | u_1, z_1) p(w_2 | u_2, z_2) p(x_1 | w_1) p(x_2 | w_2) p(y | x_1, x_2)$ 

and 

(2) there exists a function $f_D : \mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{Z} \to (\hat{\mathcal{U}}_1 \times \hat{\mathcal{U}}_2)$ such that $E[d(U_i, \hat{U}_i)] \le D_i$, $i = 1, 2$, where 
$(\hat{U}_1, \hat{U}_2) = f_D(W_1, W_2, Z)$ and the constraints 

$$I(U_1, Z_1; W_1 | W_2, Z) < I(X_1; Y | X_2, W_2, Z),$$
$$I(U_2, Z_2; W_2 | W_1, Z) < I(X_2; Y | X_1, W_1, Z), \qquad (1)$$
$$I(U_1, U_2, Z_1, Z_2; W_1, W_2 | Z) < I(X_1, X_2; Y | Z)$$

are satisfied where $\mathcal{W}_i$ are the sets in which $W_i$ take values. 

Proof: See Appendix A. ■ 

In the proof of Theorem 1 the encoding scheme involves distributed vector quantization $(W_1^n, W_2^n)$ 
of the sources $(U_1^n, U_2^n)$ and the side information $Z_1^n, Z_2^n$ followed by a correlation preserving mapping 
to the channel codewords $(X_1^n, X_2^n)$. The decoding approach involves first decoding $(W_1^n, W_2^n)$ and then 
obtaining the estimates $(\hat{U}_1^n, \hat{U}_2^n)$ as a function of $(W_1^n, W_2^n)$ and the decoder side information $Z^n$. 

If the channel alphabets are continuous (e.g., GMAC) then in addition to the conditions in Theorem 1 
certain power constraints $E[X_i^2] \le P_i$, $i = 1, 2$ are also needed. In general, we could impose a more 
general constraint $E[g_i(X_i)] \le \alpha_i$ where $g_i$ is some non-negative cost function. Furthermore, for continuous 
alphabet r.v.s (sources/channel input/output) we will assume that a probability density exists so that one can 
use differential entropy (more general cases can be handled but for simplicity we will ignore them). 

The dependence in $(U_1, U_2)$ is used in two ways in (1): to reduce the quantities on the left and to 
increase the quantities on the right. The side information $Z_1$ and $Z_2$ effectively increases the dependence 
in the inputs. 

If the source-channel separation holds then one can consider the capacity region of the channel. For 
example, when there is no side information $Z_1, Z_2, Z$ and the sources are independent then we obtain the 
rate region 

$$R_1 < I(X_1; Y | X_2), \quad R_2 < I(X_2; Y | X_1), \quad R_1 + R_2 < I(X_1, X_2; Y). \qquad (2)$$

This is the well known rate region of a MAC ([13]). To obtain (2) from (1), take $(Z_1, Z_2, Z)$ independent 
of $(U_1, U_2)$. Also, take $U_1, U_2$ discrete, $W_i = U_i$ and $X_i$ independent of $U_i$, $i = 1, 2$. 

In Theorem 1 it is possible to include other distortion constraints. For example, in addition to the 
bounds on $E[d(U_i, \hat{U}_i)]$ one may want a bound on the joint distortion $E[d((U_1, U_2), (\hat{U}_1, \hat{U}_2))]$. Then the 
only modification needed in the statement of the above theorem is to include this also as a condition in 
defining $f_D$. 

If we only want to estimate a function $g(U_1, U_2)$ at the decoder and not $(U_1, U_2)$ themselves, then again 
one can use the techniques in the proof of Theorem 1 to obtain sufficient conditions. Depending upon $g$, the 
conditions needed may be weaker than those needed in (1). We will explore this in more detail in a later 
work. 

In our problem setup the side information $Z_i$ can be included with source $U_i$ and then we can consider 
this problem as one with no side information at the encoders. However, the above formulation has the 
advantage that our conditions (1) are explicit in $Z_i$. 

The main problem in using Theorem 1 is in obtaining good source-channel coding schemes providing 
$(W_1, W_2, X_1, X_2)$ which satisfy the conditions in the theorem for a given source $(U_1, U_2)$ and a channel. 
A substantial part of this paper will be devoted to this problem. 

A. Extension to multiple sources 

The above results can be generalized to the multiple ($> 2$) source case. Let $\mathcal{S} = \{1, 2, \ldots, M\}$ be the set 
of sources with joint distribution $p(u_1, \ldots, u_M)$. 

Theorem 2: Sources $(U_i^n, i \in \mathcal{S})$ can be communicated in a distributed fashion over the memoryless 
multiple access channel $p(y | x_i, i \in \mathcal{S})$ with distortions $(D_i, i \in \mathcal{S})$ if there exist auxiliary random variables 
$(W_i, X_i, i \in \mathcal{S})$ satisfying 

(1) $p(u_i, z_i, z, w_i, x_i, y, i \in \mathcal{S}) = p(u_i, z_i, z, i \in \mathcal{S}) \, p(y | x_i, i \in \mathcal{S}) \prod_{j \in \mathcal{S}} p(w_j | u_j, z_j) p(x_j | w_j)$, 

(2) there exists a function $f_D : \prod_{j \in \mathcal{S}} \mathcal{W}_j \times \mathcal{Z} \to (\hat{\mathcal{U}}_i, i \in \mathcal{S})$ such that $E[d(U_i, \hat{U}_i)] \le D_i$, $i \in \mathcal{S}$ and the 
constraints 

$$I(U_A, Z_A; W_A | W_{A^c}, Z) < I(X_A; Y | X_{A^c}, W_{A^c}, Z), \quad \text{for all } A \subset \mathcal{S} \qquad (3)$$

are satisfied where $U_A = (U_i, i \in A)$, $A^c$ is the complement of set $A$ and similarly for other r.v.s (in case 
of continuous channel alphabets we also need the power constraints $E[X_i^2] \le P_i$, $i = 1, \ldots, |\mathcal{S}|$). 

B. Example 

We provide an example to show the reduction possible in transmission rates by exploiting the correlation 
between the sources, the side information and the permissible distortions. 

Consider $(U_1, U_2)$ with the joint distribution: $P(U_1 = 0; U_2 = 0) = P(U_1 = 1; U_2 = 1) = 1/3$; 
$P(U_1 = 1; U_2 = 0) = P(U_1 = 0; U_2 = 1) = 1/6$. If we use independent encoders which do not exploit the 
correlation among the sources then we need $R_1 > H(U_1) = 1$ bit and $R_2 > H(U_2) = 1$ bit for lossless 
coding of the sources. If we use Slepian-Wolf coding ([38]), then $R_1 > H(U_1 | U_2) = 0.918$ bits, 
$R_2 > H(U_2 | U_1) = 0.918$ bits and $R_1 + R_2 > H(U_1, U_2) = 1.918$ bits suffice. 

Next consider a multiple access channel such that $Y = X_1 + X_2$ where $X_1$ and $X_2$ take values from 
the alphabet $\{0, 1\}$ and $Y$ takes values from the alphabet $\{0, 1, 2\}$. This does not satisfy the separation 
conditions in [35]. The sum capacity $C$ of such a channel with independent $X_1$ and $X_2$ is 1.5 bits and if we 
use source-channel separation, the given sources cannot be transmitted losslessly because $H(U_1, U_2) > C$. 
Now we use a joint source-channel code to improve the capacity of the channel. Take $X_1 = U_1$ and 
$X_2 = U_2$. Then the sum rate capacity of the channel is improved to $I(X_1, X_2; Y) = 1.585$ bits. This is 
still not enough to transmit the sources over the given MAC. Next we exploit the side information. 
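The entropies and sum rates quoted in this example can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, and it relies on the adder channel being noiseless, so that $I(X_1, X_2; Y) = H(Y)$. The 1.5 bit figure is evaluated at independent uniform inputs, which is assumed here to be the maximizing choice.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


def marginal(pmf, index):
    """Marginal pmf of coordinate `index` of a joint pmf keyed by tuples."""
    out = {}
    for key, p in pmf.items():
        out[key[index]] = out.get(key[index], 0.0) + p
    return out


def adder_output_pmf(pmf_x):
    """pmf of Y = X1 + X2 for the noiseless binary adder MAC."""
    out = {}
    for (x1, x2), p in pmf_x.items():
        out[x1 + x2] = out.get(x1 + x2, 0.0) + p
    return out


# Joint pmf of (U1, U2) from the example.
p_u = {(0, 0): 1 / 3, (1, 1): 1 / 3, (1, 0): 1 / 6, (0, 1): 1 / 6}

h_joint = entropy(p_u.values())                 # H(U1, U2)
h_u1 = entropy(marginal(p_u, 0).values())       # H(U1)
h_u2 = entropy(marginal(p_u, 1).values())       # H(U2)
h_u1_given_u2 = h_joint - h_u2                  # H(U1 | U2)

# The channel is deterministic, so I(X1, X2; Y) = H(Y).
uniform_indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
sum_rate_indep = entropy(adder_output_pmf(uniform_indep).values())
sum_rate_uncoded = entropy(adder_output_pmf(p_u).values())  # X_i = U_i

print(round(h_u1_given_u2, 3), round(h_joint, 3))
print(round(sum_rate_indep, 3), round(sum_rate_uncoded, 3))
```

Under these assumptions the script reproduces the ordering used in the text: the uncoded mapping raises the achievable sum rate above the independent-input value, but it stays below $H(U_1, U_2)$.
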

Let the side-information random variables be generated as follows. $Z_1$ is transmitted from source 2 by 
using a (low rate) binary symmetric channel (BSC) with crossover probability $p = 0.3$. Similarly $Z_2$ is 
transmitted from source 1 via a similar BSC. Let $Z = (Z_1, Z_2, V)$, where $V = U_1 . U_2 . N$, $N$ is a binary 
random variable with $P(N = 0) = P(N = 1) = 0.5$ independent of $U_1$ and $U_2$ and '.' denotes the logical 
AND operation. This denotes the case when the decoder has access to the encoder side information and 
also has some extra side information. Then from (1) if we use just the side information $Z_1$ the sum rate 
for the sources needs to be 1.8 bits. By symmetry the same holds if we only have $Z_2$. If we use $Z_1$ and 
$Z_2$ then we can use the sum rate 1.683 bits. If only $V$ is used then the sum rate needed is 1.606 bits. 
So far we can still not transmit $(U_1, U_2)$ losslessly if we use the coding $U_i = X_i$, $i = 1, 2$. If all the 
information in $Z_1, Z_2, V$ is used then we need $R_1 + R_2 > 1.4120$ bits. Thus with the aid of $Z_1, Z_2, Z$ we 
can transmit $(U_1, U_2)$ losslessly over the MAC even with independent $X_1$ and $X_2$. 

Next we consider the distortion criterion to be the Hamming distance and the allowable distortion as 
4%. Then for compressing the individual sources without side information we need $R_i > H(p) - H(d) = 0.758$ 
bits, $i = 1, 2$, where $H(x) = -x \log_2(x) - (1 - x) \log_2(1 - x)$. Thus we still cannot transmit $(U_1, U_2)$ 
with this distortion when $(X_1, X_2)$ are independent. Next assume the side information $Z = (Z_1, Z_2)$ to 
be available at the decoder only. Then we need $R_1 > I(U_1; W_1) - I(Z_1; W_1)$ where $W_1$ is an auxiliary 
random variable generated from $U_1$. This implies that $R_1 > 0.6577$ bits and $R_2 > 0.6577$ bits and we 
can transmit with independent $X_1$ and $X_2$. 
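The lossy rates in this paragraph can be checked with the binary rate distortion formula. In the sketch below, $p = 0.5$ is taken to be $P(U_i = 1)$ (an assumption, chosen because each source in the example is uniform and it reproduces the quoted 0.758 bits). The 0.6577 figure is copied from the text rather than recomputed, since it depends on a choice of $W_1$ that is not spelled out here.

```python
from math import log2


def binary_entropy(x):
    """H(x) = -x log2(x) - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


p_source = 0.5       # assumed P(U_i = 1) for the uniform marginals of the example
distortion = 0.04    # allowable Hamming distortion (4%)
sum_capacity = 1.5   # adder MAC with independent inputs, from the text

# Rate needed per source without side information.
rate_no_side_info = binary_entropy(p_source) - binary_entropy(distortion)
fits_without_side_info = 2 * rate_no_side_info < sum_capacity

# Per-source rate with decoder side information, as quoted in the text.
rate_with_side_info = 0.6577
fits_with_side_info = 2 * rate_with_side_info < sum_capacity

print(round(rate_no_side_info, 3), fits_without_side_info, fits_with_side_info)
```
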



III. Special Cases 

In the following we show that our result contains several previous studies as special cases. The practically 
important special case of GMAC will be studied in detail in later sections. There we will discuss several 
specific joint source-channel coding schemes for GMAC and compare their performance. 



A. Lossless multiple access communication with correlated sources 

Take $(Z_1, Z_2, Z) \perp (U_1, U_2)$ ($X \perp Y$ denotes that r.v. $X$ is independent of r.v. $Y$) and $W_1 = U_1$ and 
$W_2 = U_2$ where $U_1, U_2$ are discrete sources. Then the constraints of (1) reduce to 

$$H(U_1 | U_2) < I(X_1; Y | X_2, U_2), \quad H(U_2 | U_1) < I(X_2; Y | X_1, U_1), \quad H(U_1, U_2) < I(X_1, X_2; Y) \qquad (4)$$

where $X_1 \leftrightarrow U_1 \leftrightarrow U_2 \leftrightarrow X_2$. These are the conditions obtained in [12]. 

If $U_1, U_2$ are independent, then $H(U_1 | U_2) = H(U_1)$ and $I(X_1; Y | X_2, U_2) = I(X_1; Y | X_2)$. 

B. Lossy multiple access communication 

Take $(Z_1, Z_2, Z) \perp (U_1, U_2, W_1, W_2)$. In this case the constraints in (1) reduce to 

$$I(U_1; W_1 | W_2) < I(X_1; Y | X_2, W_2), \quad I(U_2; W_2 | W_1) < I(X_2; Y | X_1, W_1),$$
$$I(U_1, U_2; W_1, W_2) < I(X_1, X_2; Y). \qquad (5)$$

This is an immediate generalization of [12] to the lossy case. 



C. Lossless multiple access communication with common information 

Consider $U_1 = (U_1', U_0')$, $U_2 = (U_2', U_0')$ where $U_0', U_1', U_2'$ are independent of each other. $U_0'$ is interpreted 
as the common information at the two encoders. Then, taking $(Z_1, Z_2, Z) \perp (U_1, U_2)$, $W_1 = U_1$ and 
$W_2 = U_2$ we obtain sufficient conditions for lossless transmission as 

$$H(U_1') < I(X_1; Y | X_2, U_0'), \quad H(U_2') < I(X_2; Y | X_1, U_0'),$$
$$H(U_1') + H(U_2') + H(U_0') < I(X_1, X_2; Y). \qquad (6)$$

This provides the capacity region of the MAC with common information available in [37]. 
Our results generalize this result to lossy transmission also. 

D. Lossy distributed source coding with side information 

The multiple access channel is taken as a dummy channel which reproduces its inputs. In this case 
we obtain that the sources can be coded with rates $R_1$ and $R_2$ to obtain the specified distortions at the 
decoder if 

$$R_1 > I(U_1, Z_1; W_1 | W_2, Z), \quad R_2 > I(U_2, Z_2; W_2 | W_1, Z),$$
$$R_1 + R_2 > I(U_1, U_2, Z_1, Z_2; W_1, W_2 | Z) \qquad (7)$$

where $R_1, R_2$ are obtained by taking $X_1 \perp X_2$. 

This recovers the result in [39], and generalizes the results in [19], [38], [42]. 



E. Correlated sources with lossless transmission over MAC with receiver side information 



If we consider $(Z_1, Z_2) \perp (U_1, U_2)$, $W_1 = U_1$ and $W_2 = U_2$ then we recover the conditions 

$$H(U_1 | U_2, Z) < I(X_1; Y | X_2, U_2, Z), \quad H(U_2 | U_1, Z) < I(X_2; Y | X_1, U_1, Z),$$
$$H(U_1, U_2 | Z) < I(X_1, X_2; Y | Z) \qquad (8)$$

in Theorem 2.1 in [23]. 



F. Mixed Side Information 

The aim is to determine the rate distortion function for transmitting a source $X$ with the aid of side 
information $(Y, Z)$ (system in Fig 1(c) of [16]). The encoder is provided with $Y$ and the decoder has 
access to both $Y$ and $Z$. This represents the Mixed side information (MSI) system which combines the 
conditional rate distortion system and the Wyner-Ziv system. This has the system in Fig 1(a) and (b) of 
[16] as special cases. 

The results of Fig 1(c) can be recovered from our Theorem if we take $X, Y, Z, W$ in [16] as $U_1 = X$, 
$Z = (Z, Y)$, $Z_1 = Y$ and $W_1 = W$. We also take $U_2$ and $Z_2$ to be constants. The acceptable rate region 
is given by $R > I(X; W | Y, Z)$, where $W$ is a random variable with the property $W \leftrightarrow (X, Y) \leftrightarrow Z$ and 
for which there exists a decoder function such that the distortion constraints are met. 

G. Compound MAC and Interference channel with side information 

In the compound MAC, sources $U_1$ and $U_2$ are transmitted through a MAC which has two outputs $Y_1$ and 
$Y_2$. Decoder $i$ is provided with $Y_i$ and $Z_i$, $i = 1, 2$. Each decoder is supposed to reconstruct both the 
sources. We take $W_1 = U_1$ and $W_2 = U_2$. We can consider this system as two MACs. Applying (1) twice 
we have for $i = 1, 2$, 

$$H(U_1 | U_2, Z_i) < I(X_1; Y_i | X_2, U_2, Z_i), \quad H(U_2 | U_1, Z_i) < I(X_2; Y_i | X_1, U_1, Z_i),$$
$$H(U_1, U_2 | Z_i) < I(X_1, X_2; Y_i | Z_i). \qquad (9)$$

This recovers the achievability result in [22]. This provides the achievability conditions in [22] for strong 
interference channel conditions also. 

H. Correlated sources over orthogonal channels with side information 

The sources transmit their codewords $X_i$'s to a single decoder through memoryless orthogonal channels 
having transition probabilities $p(y_1 | x_1)$ and $p(y_2 | x_2)$. Hence in the theorem, $Y = (Y_1, Y_2)$ and 
$Y_1 \leftrightarrow X_1 \leftrightarrow W_1 \leftrightarrow (U_1, Z_1) \leftrightarrow (U_2, Z_2) \leftrightarrow W_2 \leftrightarrow X_2 \leftrightarrow Y_2$. In this case the constraints in (1) reduce to 

$$I(U_1, Z_1; W_1 | W_2, Z) < I(X_1; Y_1 | W_2, Z) \le I(X_1; Y_1),$$
$$I(U_2, Z_2; W_2 | W_1, Z) < I(X_2; Y_2 | W_1, Z) \le I(X_2; Y_2), \qquad (10)$$
$$I(U_1, U_2, Z_1, Z_2; W_1, W_2 | Z) < I(X_1, X_2; Y_1, Y_2 | Z) \le I(X_1; Y_1) + I(X_2; Y_2).$$

The outer bounds in (10) are attained if the channel codewords $(X_1, X_2)$ are independent of each 
other. Also, the distribution of $(X_1, X_2)$ maximizing these bounds is not dependent on the distribution 
of $(U_1, U_2)$. 

Using Fano's inequality, for lossless transmission of discrete sources over discrete channels with side 
information, we can show that the outer bounds in (10) are in fact necessary and sufficient. The proof of the 
converse is given in Appendix B. 

If we take $W_1 = U_1$ and $W_2 = U_2$ and the side information $(Z_1, Z_2, Z) \perp (U_1, U_2)$, we can recover the 
necessary and sufficient conditions in [7]. 



I. Gaussian sources over a Gaussian MAC 

Let $(U_1, U_2)$ be jointly Gaussian with mean zero, variances $\sigma_i^2$, $i = 1, 2$ and correlation $\rho$. These sources 
have to be communicated over a Gaussian MAC with the output $Y_n$ at time $n$ given by $Y_n = X_{1n} + X_{2n} + N_n$ 
where $X_{1n}$ and $X_{2n}$ are the channel inputs at time $n$ and $N_n$ is a Gaussian random variable independent 
of $X_{1n}$ and $X_{2n}$, with $E[N_n] = 0$ and $\mathrm{var}(N_n) = \sigma_N^2$. The power constraints are $E[X_i^2] \le P_i$, $i = 1, 2$. 

The distortion measure is the mean square error (MSE). We take $(Z_1, Z_2, Z) \perp (U_1, U_2)$. We choose $W_1$ 
and $W_2$ according to the coding scheme given in [27]. $X_1$ and $X_2$ are scaled versions of $W_1$ and $W_2$ 
respectively. Then from (1) we find that the rates at which $W_1$ and $W_2$ are encoded satisfy 

$$R_1 < 0.5 \log\left[\frac{P_1}{\sigma_N^2} + \frac{1}{1 - \tilde{\rho}^2}\right], \quad R_2 < 0.5 \log\left[\frac{P_2}{\sigma_N^2} + \frac{1}{1 - \tilde{\rho}^2}\right],$$
$$R_1 + R_2 < 0.5 \log\left[\frac{\sigma_N^2 + P_1 + P_2 + 2\tilde{\rho}\sqrt{P_1 P_2}}{(1 - \tilde{\rho}^2)\sigma_N^2}\right]$$

where $\tilde{\rho}$ is the correlation between $X_1$ and $X_2$. The distortions achieved are 

$$D_1 \ge \mathrm{var}(U_1 | W_1, W_2) = \frac{\sigma_1^2 \, 2^{-2R_1}\left[1 - \rho^2(1 - 2^{-2R_2})\right]}{1 - \tilde{\rho}^2},$$
$$D_2 \ge \mathrm{var}(U_2 | W_1, W_2) = \frac{\sigma_2^2 \, 2^{-2R_2}\left[1 - \rho^2(1 - 2^{-2R_1})\right]}{1 - \tilde{\rho}^2}. \qquad (11)$$

This recovers the sufficient conditions in [27]. 



IV. Discrete Alphabet Sources over Gaussian MAC 

This system is practically very useful. For example, in a sensor network, the observations sensed by 
the sensor nodes are discretized and then transmitted over a GMAC. The physical proximity of the sensor 
nodes makes their observations correlated. This correlation can be exploited to compress the transmitted 
data and increase the channel capacity. We present a novel distributed 'correlation preserving' joint 
source-channel coding scheme yielding jointly Gaussian channel codewords which transmit the data efficiently 
over a GMAC. 

Sufficient conditions for lossless transmission of two discrete correlated sources $(U_1, U_2)$ (generating 
iid sequences in time) over a general MAC with no side information are obtained in (4). 

In this section, we further specialize these results to a GMAC: $Y = X_1 + X_2 + N$ where $N$ is a Gaussian 
random variable independent of $X_1$ and $X_2$. The noise $N$ satisfies $E[N] = 0$ and $\mathrm{Var}(N) = \sigma_N^2$. We 
will also have the transmit power constraints: $E[X_i^2] \le P_i$, $i = 1, 2$. Since source-channel separation does 
not hold for this system, a joint source-channel coding scheme is needed for optimal performance. 

The dependence of the right hand side (RHS) in (4) on input alphabets prevents us from getting a closed 
form expression for the admissibility criterion. Therefore we relax the conditions by taking away the 
dependence on the input alphabets to obtain good joint source-channel codes. 

Lemma 1: Under our assumptions, $I(X_1; Y | X_2, U_2) \le I(X_1; Y | X_2)$. 

Proof: Given in the appendices. ■ 

Thus from (4), 

$$H(U_1 | U_2) < I(X_1; Y | X_2, U_2) \le I(X_1; Y | X_2), \qquad (12)$$
$$H(U_2 | U_1) < I(X_2; Y | X_1, U_1) \le I(X_2; Y | X_1), \qquad (13)$$
$$H(U_1, U_2) < I(X_1, X_2; Y). \qquad (14)$$

The relaxation of the upper bounds is only in (12) and (13) and not in (14). 

We show that the relaxed upper bounds are maximized if $(X_1, X_2)$ is jointly Gaussian and the correlation 
$\rho$ between $X_1$ and $X_2$ is high (the highest possible $\rho$ may not give the largest upper bound in (12)-(14)). 

Lemma 2: A jointly Gaussian distribution for $(X_1, X_2)$ maximizes $I(X_1; Y | X_2)$, $I(X_2; Y | X_1)$ and 
$I(X_1, X_2; Y)$ simultaneously. 

Proof: Given in the appendices. ■ 

The difference between the bounds in (12) is 

$$I(X_1; Y | X_2) - I(X_1; Y | X_2, U_2) = I(X_1 + N; U_2 | X_2). \qquad (15)$$
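One way to see why the difference takes this form is through the chain rule for mutual information. The short derivation below is a sketch that uses only two facts from the setup: $U_2 \leftrightarrow (X_1, X_2) \leftrightarrow Y$ (the channel output depends on the sources only through the channel inputs) and $Y = X_1 + X_2 + N$.

```latex
\begin{align*}
I(X_1, U_2; Y \mid X_2)
  &= I(X_1; Y \mid X_2) + I(U_2; Y \mid X_1, X_2) \\
  &= I(U_2; Y \mid X_2) + I(X_1; Y \mid X_2, U_2).
\end{align*}
% The Markov chain U_2 <-> (X_1, X_2) <-> Y gives I(U_2; Y | X_1, X_2) = 0, hence
\begin{align*}
I(X_1; Y \mid X_2) - I(X_1; Y \mid X_2, U_2)
  &= I(U_2; Y \mid X_2) \\
  &= I(U_2; X_1 + X_2 + N \mid X_2) \\
  &= I(U_2; X_1 + N \mid X_2).
\end{align*}
```

The last step subtracts the known $X_2$ from $Y$, which does not change the conditional mutual information. Non-negativity of the right hand side also gives the inequality of Lemma 1.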

This difference is small if the correlation between $(U_1, U_2)$ is small. In that case $H(U_1 | U_2)$ and $H(U_2 | U_1)$ will 
be large and (12) and (13) can be active constraints. If the correlation between $(U_1, U_2)$ is large, $H(U_1 | U_2)$ 
and $H(U_2 | U_1)$ will be small and (14) will be the only active constraint. In this case the difference between 
the two bounds in (12) and (13) is large but not important. Thus, the outer bounds in (12) and (13) are 
close to the inner bounds whenever the constraints (12) and (13) are active. Often (14) will be the only 
active constraint. 

Based on Lemma 2 we use jointly Gaussian channel inputs $(X_1, X_2)$ with the transmit power 
constraints. Thus we take $(X_1, X_2)$ with mean vector $[0 \; 0]$ and covariance matrix 

$$K_{X_1, X_2} = \begin{pmatrix} P_1 & \rho\sqrt{P_1 P_2} \\ \rho\sqrt{P_1 P_2} & P_2 \end{pmatrix}.$$

The outer bounds in (12)-(14) become 

$$0.5 \log\left[1 + \frac{P_1(1 - \rho^2)}{\sigma_N^2}\right], \quad 0.5 \log\left[1 + \frac{P_2(1 - \rho^2)}{\sigma_N^2}\right] \quad \text{and} \quad 0.5 \log\left[1 + \frac{P_1 + P_2 + 2\rho\sqrt{P_1 P_2}}{\sigma_N^2}\right]$$

respectively. The first two upper bounds decrease 
as $\rho$ increases. But the third upper bound increases with $\rho$ and often the third constraint is the limiting 
constraint. Thus, once $(X_1, X_2)$ are obtained we can check for sufficient conditions (4). If these are not 
satisfied for the $(X_1, X_2)$ obtained, we will increase the correlation $\rho$ between $(X_1, X_2)$ if possible (see 
details below). Increasing the correlation in $(X_1, X_2)$ will decrease the difference in (15) and increase the 
possibility of satisfying (4) when the outer bounds in (12) and (13) are satisfied. If not, we can increase 
$\rho$ further till we satisfy (4). 
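The trade-off in $\rho$ described above can be explored numerically. The following sketch evaluates the three Gaussian outer bounds for given powers, noise variance and correlation, and checks the relaxed conditions (12)-(14) for given source entropies. Function and parameter names are hypothetical, and logarithms are taken to base 2 (rates in bits), which is an assumption about the units.

```python
from math import log2, sqrt


def gmac_outer_bounds(p1, p2, noise_var, rho):
    """Outer bounds on I(X1;Y|X2), I(X2;Y|X1), I(X1,X2;Y) for jointly
    Gaussian inputs with powers p1, p2 and correlation rho (in bits)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    b1 = 0.5 * log2(1 + p1 * (1 - rho ** 2) / noise_var)
    b2 = 0.5 * log2(1 + p2 * (1 - rho ** 2) / noise_var)
    b_sum = 0.5 * log2(1 + (p1 + p2 + 2 * rho * sqrt(p1 * p2)) / noise_var)
    return b1, b2, b_sum


def relaxed_conditions_hold(h1_given_2, h2_given_1, h_joint, bounds):
    """Check the relaxed conditions: each entropy strictly below its bound."""
    b1, b2, b_sum = bounds
    return h1_given_2 < b1 and h2_given_1 < b2 and h_joint < b_sum


# Illustrative values only: unit powers and unit noise variance.
low = gmac_outer_bounds(1.0, 1.0, 1.0, 0.0)
high = gmac_outer_bounds(1.0, 1.0, 1.0, 0.5)
print(low, high)
```

Comparing `low` and `high` shows the behaviour stated in the text: the two individual bounds shrink as $\rho$ grows while the sum bound grows.
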

The next lemma provides an upper bound on the correlation $\rho$ between $(X_1, X_2)$ possible in terms of 
the distribution of $(U_1, U_2)$. 

Lemma 3: Let $(U_1, U_2)$ be the correlated sources and $X_1 \leftrightarrow U_1 \leftrightarrow U_2 \leftrightarrow X_2$ where $X_1$ and $X_2$ are 
jointly Gaussian. Then the correlation $\rho$ between $(X_1, X_2)$ satisfies $\rho^2 \le 1 - 2^{-2I(U_1; U_2)}$. 

Proof: Given in the appendices. ■ 

It is stated in [35], without proof, that the correlation between $(X_1, X_2)$ cannot be greater than the 
correlation of the source $(U_1, U_2)$. Lemma 3 gives a tighter bound in many cases. Consider $(U_1, U_2)$ with 
the joint distribution: $P(U_1 = 0; U_2 = 0) = P(U_1 = 1; U_2 = 1) = 0.4444$; $P(U_1 = 1; U_2 = 0) = 
P(U_1 = 0; U_2 = 1) = 0.0556$. The correlation between the sources is 0.7778 but from Lemma 3, the correlation 
between $(X_1, X_2)$ cannot exceed 0.7055. 
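The two numbers in this example can be reproduced directly from the joint distribution. The sketch below uses the exact fractions 4/9 and 1/18, assumed here to be the values that round to 0.4444 and 0.0556 in the text; the helper names are hypothetical.

```python
from math import log2, sqrt

# Joint pmf of (U1, U2); 4/9 and 1/18 are assumed to be the unrounded values.
p_u = {(0, 0): 4 / 9, (1, 1): 4 / 9, (1, 0): 1 / 18, (0, 1): 1 / 18}


def marginals(pmf):
    """Return the two marginal pmfs of a joint pmf keyed by pairs."""
    p1, p2 = {}, {}
    for (a, b), p in pmf.items():
        p1[a] = p1.get(a, 0.0) + p
        p2[b] = p2.get(b, 0.0) + p
    return p1, p2


def mutual_information(pmf):
    """I(U1; U2) in bits."""
    p1, p2 = marginals(pmf)
    return sum(p * log2(p / (p1[a] * p2[b]))
               for (a, b), p in pmf.items() if p > 0)


def correlation(pmf):
    """Pearson correlation coefficient of (U1, U2)."""
    m1 = sum(a * p for (a, _), p in pmf.items())
    m2 = sum(b * p for (_, b), p in pmf.items())
    cov = sum((a - m1) * (b - m2) * p for (a, b), p in pmf.items())
    v1 = sum((a - m1) ** 2 * p for (a, _), p in pmf.items())
    v2 = sum((b - m2) ** 2 * p for (_, b), p in pmf.items())
    return cov / sqrt(v1 * v2)


rho_source = correlation(p_u)
# Lemma 3: rho^2 <= 1 - 2^(-2 I(U1; U2)).
rho_bound = sqrt(1 - 2 ** (-2 * mutual_information(p_u)))
print(round(rho_source, 4), round(rho_bound, 4))
```
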

A. A coding Scheme 

In this section we develop a distributed coding scheme for mapping the discrete alphabets $(U_1, U_2)$ 
into jointly Gaussian correlated code words $(X_1, X_2)$ which satisfy (4) and the Markov condition. The 
heart of the scheme is to approximate a jointly Gaussian distribution with a sum of products of Gaussian 
marginals. Although this is stated in the following lemma for two dimensional vectors $(X_1, X_2)$, the results 
hold for any finite dimensional vectors (hence can be used for any number of users sharing the MAC). 



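Lemma 4: Any jointly Gaussian two dimensional density can be uniformly arbitrarily closely approximated 
by a weighted sum of products of marginal Gaussian densities: 

$$\sum_{i=1}^{N} p_i \, g_{1i}(x_1) \, q_i \, g_{2i}(x_2), \qquad (16)$$

where $g_{1i}$ and $g_{2i}$ are Gaussian marginal densities. 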



From the above lemma we can form a sequence of functions f n (xi,x 2 ) of type (fT6l) such that 
sup xl>X2 \f n (xx, x 2 ) — f{x\, x 2 )\ — > as n — > oo, where / is a given jointly Gaussian density. Although /„ 
are not guaranteed to be probability densities, due to uniform convergence, for large n, they will almost 
be. In the following lemma we will assume that we have made the minor modification to ensure that /„ is 
a proper density for large enough n. This lemma shows that obtaining (X 1 ,X 2 ) from such approximations 
can provide the (relaxed) upper bounds in (fT2l - (fl4l) (we actually show for the third inequality only but 
this can be shown for the other inequalities in the same way). Of course, as mentioned earlier, then these 
can be used to obtain the (Xi,X 2 ) which satisfy the actual bounds in ©. 

Let (X m i,X m2 ) and (Xi,X 2 ) be random variables with densities f m and / and sup x ^ X2 \f m (xi,x 2 ) — 
f(xi,x 2 )\ — * as m — ► oo. Let Y m and Y denote the corresponding channel outputs. 

Lemma 5: For the random variables defined above, if {log f m (Y m ) , m > 1} is uniformly integrable, 

I(X ml , X m2 ; Y m ) -> Ipfi, X 2 ; Y) as m -> oo. 

Proof: See Appendix UTO ■ 
A set of sufficient conditions for uniform integrability of {logf m (Y m ),m > 1} is 

(1) Number of components in (fT6l) is upper bounded. 

(2) Variance of component densities in (fT6l) is upper bounded and lower bounded away from zero. 

(3) The means of the component densities in (fT6l) are in a bounded set. 

From Lemma |4] a joint Gaussian density with any correlation can be expressed by a linear combination 
of marginal Gaussian densities. But the coefficients p { and q { in (fT6b may be positive or negative. To 
realize our coding scheme, we would like to have the p^s and g^'s to be non negative. This introduces 
constraints on the realizable Gaussian densities in our coding scheme. For example, from Lemma |3l the 
correlation p between X x and X 2 cannot exceed yl — 2~ 2I( - Ul ' U2 \ Also there is still the question of getting 
a good linear combination of marginal densities to obtain the joint density for a given N in (TT6l) . 




(16) 



Proof: See Appendix Hill 



This motivates us to consider an optimization procedure for finding $p_i, q_i, a_{1i}, a_{2i}, c_{1i}$ and $c_{2i}$ in (16)
that provides the best approximation to a given joint Gaussian density. We illustrate this with an example.
Consider $U_1, U_2$ to be binary. Let $P(U_1 = 0; U_2 = 0) = p_{00}$; $P(U_1 = 0; U_2 = 1) = p_{01}$; $P(U_1 = 1; U_2 =
0) = p_{10}$ and $P(U_1 = 1; U_2 = 1) = p_{11}$. Define (the notation in the following has been slightly changed
compared to (16))

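$$f(X_1 = \cdot \mid U_1 = 0) = p_{101}\mathcal{N}(a_{101}, c_{101}) + p_{102}\mathcal{N}(a_{102}, c_{102}) + \cdots + p_{10r_1}\mathcal{N}(a_{10r_1}, c_{10r_1}), \qquad (17)$$

$$f(X_1 = \cdot \mid U_1 = 1) = p_{111}\mathcal{N}(a_{111}, c_{111}) + p_{112}\mathcal{N}(a_{112}, c_{112}) + \cdots + p_{11r_2}\mathcal{N}(a_{11r_2}, c_{11r_2}), \qquad (18)$$

$$f(X_2 = \cdot \mid U_2 = 0) = p_{201}\mathcal{N}(a_{201}, c_{201}) + p_{202}\mathcal{N}(a_{202}, c_{202}) + \cdots + p_{20r_3}\mathcal{N}(a_{20r_3}, c_{20r_3}), \qquad (19)$$

$$f(X_2 = \cdot \mid U_2 = 1) = p_{211}\mathcal{N}(a_{211}, c_{211}) + p_{212}\mathcal{N}(a_{212}, c_{212}) + \cdots + p_{21r_4}\mathcal{N}(a_{21r_4}, c_{21r_4}), \qquad (20)$$

where $\mathcal{N}(a, b)$ denotes the Gaussian density with mean $a$ and variance $b$. Let $\mathbf{p}$ be the vector with components
$p_{101}, \ldots, p_{10r_1}, p_{111}, \ldots, p_{11r_2}, p_{201}, \ldots, p_{20r_3}, p_{211}, \ldots, p_{21r_4}$. Similarly we denote by $\mathbf{a}$ and $\mathbf{c}$ the vectors
with components $a_{101}, \ldots, a_{10r_1}, a_{111}, \ldots, a_{11r_2}, a_{201}, \ldots, a_{20r_3}, a_{211}, \ldots, a_{21r_4}$ and $c_{101}, \ldots, c_{10r_1}, c_{111}, \ldots, c_{11r_2},
c_{201}, \ldots, c_{20r_3}, c_{211}, \ldots, c_{21r_4}$. The mixtures of Gaussian densities (17)-(20) will be used to obtain the RHS in
(16) for an optimal approximation. For a given $\mathbf{p}, \mathbf{a}, \mathbf{c}$, the resulting joint density is

$$g_{\mathbf{p},\mathbf{a},\mathbf{c}} = p_{00} f(X_1 = \cdot \mid U_1 = 0) f(X_2 = \cdot \mid U_2 = 0) + p_{01} f(X_1 = \cdot \mid U_1 = 0) f(X_2 = \cdot \mid U_2 = 1) + p_{10} f(X_1 = \cdot \mid U_1 = 1) f(X_2 = \cdot \mid U_2 = 0) + p_{11} f(X_1 = \cdot \mid U_1 = 1) f(X_2 = \cdot \mid U_2 = 1).$$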
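Let $f_\rho(x_1, x_2)$ be the jointly Gaussian density that we want to approximate. Let it have zero mean and
covariance matrix $K_{X_1 X_2} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$. The best $g_{\mathbf{p},\mathbf{a},\mathbf{c}}$ is obtained by solving the minimization problem:

$$\min_{\mathbf{p},\mathbf{a},\mathbf{c}} \int \left[ g_{\mathbf{p},\mathbf{a},\mathbf{c}}(x_1, x_2) - f_\rho(x_1, x_2) \right]^2 dx_1 \, dx_2 \qquad (21)$$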

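subject to

$$(p_{00} + p_{01}) \sum_{i=1}^{r_1} p_{10i} a_{10i} + (p_{10} + p_{11}) \sum_{i=1}^{r_2} p_{11i} a_{11i} = 0,$$

$$(p_{00} + p_{10}) \sum_{i=1}^{r_3} p_{20i} a_{20i} + (p_{01} + p_{11}) \sum_{i=1}^{r_4} p_{21i} a_{21i} = 0,$$

$$(p_{00} + p_{01}) \sum_{i=1}^{r_1} p_{10i} (c_{10i} + a_{10i}^2) + (p_{10} + p_{11}) \sum_{i=1}^{r_2} p_{11i} (c_{11i} + a_{11i}^2) = 1,$$

$$(p_{00} + p_{10}) \sum_{i=1}^{r_3} p_{20i} (c_{20i} + a_{20i}^2) + (p_{01} + p_{11}) \sum_{i=1}^{r_4} p_{21i} (c_{21i} + a_{21i}^2) = 1,$$

$$\sum_{i=1}^{r_1} p_{10i} = 1, \quad \sum_{i=1}^{r_2} p_{11i} = 1, \quad \sum_{i=1}^{r_3} p_{20i} = 1, \quad \sum_{i=1}^{r_4} p_{21i} = 1,$$

$p_{10i} \ge 0, c_{10i} \ge 0$ for $i \in \{1, 2, \ldots, r_1\}$, $p_{11i} \ge 0, c_{11i} \ge 0$ for $i \in \{1, 2, \ldots, r_2\}$,
$p_{20i} \ge 0, c_{20i} \ge 0$ for $i \in \{1, 2, \ldots, r_3\}$, $p_{21i} \ge 0, c_{21i} \ge 0$ for $i \in \{1, 2, \ldots, r_4\}$.

The above constraints are such that the resulting distribution $g$ for $(X_1, X_2)$ will satisfy $E[X_i] = 0$ and
$E[X_i^2] = 1$, $i = 1, 2$.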
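To make the objective in (21) concrete, the following sketch evaluates it by a Riemann sum on a finite grid for binary sources. It is only an illustration under stated assumptions: the grid half-width, the number of grid points and all function names are choices made here, not part of the paper, and the result could be passed to any constrained solver (for instance `scipy.optimize.minimize` with `method="SLSQP"`) in place of the MATLAB routine used later.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Gaussian density N(mean, var) evaluated at x."""
    return np.exp(-(x - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Gaussian mixture density, as in (17)-(20)."""
    return sum(w * normal_pdf(x, a, c) for w, a, c in zip(weights, means, variances))

def joint_gaussian_pdf(x1, x2, rho):
    """Zero-mean, unit-variance bivariate Gaussian density with correlation rho."""
    q = (x1 ** 2 - 2.0 * rho * x1 * x2 + x2 ** 2) / (1.0 - rho ** 2)
    return np.exp(-0.5 * q) / (2.0 * np.pi * np.sqrt(1.0 - rho ** 2))

def objective(source_pmf, cond1, cond2, rho, half_width=6.0, points=401):
    """Grid approximation of the integral in (21).

    source_pmf[u1][u2] = P(U1=u1, U2=u2); cond1[u] and cond2[u] are
    (weights, means, variances) triples describing f(X1|U1=u), f(X2|U2=u).
    """
    grid = np.linspace(-half_width, half_width, points)
    step = grid[1] - grid[0]
    x1, x2 = np.meshgrid(grid, grid, indexing="ij")
    f1 = [mixture_pdf(grid, *cond1[u]) for u in (0, 1)]
    f2 = [mixture_pdf(grid, *cond2[u]) for u in (0, 1)]
    # g(x1, x2) = sum_{u,v} P(u, v) f(x1|u) f(x2|v)
    g = sum(source_pmf[u][v] * np.outer(f1[u], f2[v]) for u in (0, 1) for v in (0, 1))
    return float(np.sum((g - joint_gaussian_pdf(x1, x2, rho)) ** 2) * step ** 2)

# Sanity check: standard normal components reproduce the rho = 0 target.
std = ([1.0], [0.0], [1.0])
pmf = [[1 / 3, 1 / 3], [0.0, 1 / 3]]
print(objective(pmf, [std, std], [std, std], rho=0.0))
```

The mean, power and normalization constraints listed above are not enforced inside `objective`; a solver would have to impose them separately.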

The above coding scheme will be used to obtain a codebook as follows. If user 1 produces $U_1 = 0$, then
with probability $p_{10i}$ encoder 1 obtains codeword $X_1$ from the distribution $\mathcal{N}(a_{10i}, c_{10i})$,
independently of other codewords. Similarly we obtain the codewords for $U_1 = 1$ and for user 2. Once
we have found the encoder maps, the encoding and decoding are as described in the proof of Theorem 1.
The decoding is done by joint typicality of the received $Y^n$ with $(U_1^n, U_2^n)$.

This coding scheme can be extended to any discrete alphabet case. We give an example below to 
illustrate the coding scheme. 
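The codeword generation step described above can be sketched as follows. This is an illustration only: the mixture parameters are hypothetical values chosen to have zero mean and unit second moment, and in the actual scheme each codeword is generated once at codebook construction and then revealed to the encoders and the decoder.

```python
import random

def draw_symbol(u, mixtures, rng):
    """Draw one channel symbol for source symbol u.

    mixtures[u] is a (weights, means, variances) triple: component i is
    picked with probability weights[i], then the symbol is drawn from
    N(means[i], variances[i]).
    """
    weights, means, variances = mixtures[u]
    i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.gauss(means[i], variances[i] ** 0.5)

def generate_codeword(source_sequence, mixtures, rng):
    """Generate the codeword assigned to a source sequence, symbol by symbol."""
    return [draw_symbol(u, mixtures, rng) for u in source_sequence]

# Hypothetical two-component mixtures for user 1 (illustrative values only):
# mean 0 and second moment 0.75 + 0.5**2 = 1 for both source symbols.
mixtures_user1 = {
    0: ([0.5, 0.5], [-0.5, 0.5], [0.75, 0.75]),
    1: ([0.5, 0.5], [-0.5, 0.5], [0.75, 0.75]),
}

rng = random.Random(0)
codeword = generate_codeword([0, 1, 1, 0], mixtures_user1, rng)
print(codeword)
```

User 2 would use its own mixtures in the same way, with no communication between the two encoders.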

B. Example 

Consider $(U_1, U_2)$ with the joint distribution: $P(U_1 = 0; U_2 = 0) = P(U_1 = 1; U_2 = 1) = P(U_1 =
0; U_2 = 1) = 1/3$; $P(U_1 = 1; U_2 = 0) = 0$ and power constraints $P_1 = 3$; $P_2 = 4$. Also consider a GMAC
with $\sigma_N^2 = 1$. If the sources are mapped into independent channel code words, then the sum rate condition
in (14) with $\rho = 0$ should hold. The LHS evaluates to 1.585 bits whereas the RHS is 1.5 bits. Thus (14)
is violated and hence the sufficient conditions in (11) are also violated.
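The comparison above can be reproduced with a few lines. The sketch assumes that the sum rate condition has the usual Gaussian MAC form $H(U_1, U_2) < 0.5 \log_2[1 + (P_1 + P_2 + 2\rho\sqrt{P_1 P_2})/\sigma_N^2]$; the function names are illustrative.

```python
import math

def joint_entropy_bits(joint):
    """H(U1, U2) in bits for a joint pmf joint[u1][u2]."""
    return -sum(p * math.log2(p) for row in joint for p in row if p > 0)

def sum_rate_bound_bits(p1, p2, noise_var, rho):
    """Assumed RHS of the sum rate condition for jointly Gaussian inputs."""
    return 0.5 * math.log2(1.0 + (p1 + p2 + 2.0 * rho * math.sqrt(p1 * p2)) / noise_var)

joint = [[1 / 3, 1 / 3], [0.0, 1 / 3]]
lhs = joint_entropy_bits(joint)                 # log2(3)
rhs = sum_rate_bound_bits(3.0, 4.0, 1.0, 0.0)   # independent codewords
print(round(lhs, 3), round(rhs, 3), "satisfied" if lhs < rhs else "violated")
```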

In the following we explore the possibility of using correlated $(X_1, X_2)$ to see if we can transmit this
source on the given MAC. The inputs $(U_1, U_2)$ can be distributedly mapped to jointly Gaussian channel
code words $(X_1, X_2)$ by the technique mentioned above. The maximum values of $\rho$ which satisfy the upper bounds
in (12) and (13) are 0.7024 and 0.7874 respectively, and the minimum $\rho$ which satisfies (14) is 0.144.
From Lemma 3, $\rho$ is upper bounded by 0.546. Therefore we want to obtain jointly Gaussian $(X_1, X_2)$
satisfying $X_1 \leftrightarrow U_1 \leftrightarrow U_2 \leftrightarrow X_2$ with correlation $\rho \in [0.144, 0.546]$. If we choose $\rho = 0.3$, it meets the
inner bounds in (12)-(13) (i.e., the bounds in (11)): $I(X_1; Y|X_2, U_2) = 0.792$, $I(X_2; Y|X_1, U_1) = 0.996$,
$I(X_1; Y|X_2) = 0.949$, $I(X_2; Y|X_1) = 1.107$, $H(U_1|U_2) = H(U_2|U_1) = 0.66$.
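The admissible range of $\rho$ quoted above follows from inverting the three relaxed bounds. The sketch below assumes these bounds have the standard Gaussian MAC form, $0.5\log_2[1 + P_i(1-\rho^2)/\sigma_N^2]$ for the individual conditions and $0.5\log_2[1 + (P_1 + P_2 + 2\rho\sqrt{P_1P_2})/\sigma_N^2]$ for the sum condition; the helper names are introduced here for illustration.

```python
import math

def entropy_bits(pmf):
    """Entropy in bits of a pmf given as a flat list."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def max_rho_individual(cond_entropy, power, noise_var):
    """Largest rho with cond_entropy <= 0.5*log2(1 + power*(1 - rho^2)/noise_var)."""
    slack = 1.0 - (2.0 ** (2.0 * cond_entropy) - 1.0) * noise_var / power
    return math.sqrt(slack) if slack >= 0 else None

def min_rho_sum(joint_entropy, p1, p2, noise_var):
    """Smallest rho with joint_entropy <= 0.5*log2(1 + (p1+p2+2*rho*sqrt(p1*p2))/noise_var)."""
    needed = (2.0 ** (2.0 * joint_entropy) - 1.0) * noise_var - p1 - p2
    return needed / (2.0 * math.sqrt(p1 * p2))

joint = [[1 / 3, 1 / 3], [0.0, 1 / 3]]
h12 = entropy_bits([p for row in joint for p in row])   # H(U1, U2)
h1 = entropy_bits([sum(row) for row in joint])          # H(U1)
h2 = entropy_bits([sum(col) for col in zip(*joint)])    # H(U2)

rho_max_1 = max_rho_individual(h12 - h2, 3.0, 1.0)      # uses H(U1|U2), P1 = 3
rho_max_2 = max_rho_individual(h12 - h1, 4.0, 1.0)      # uses H(U2|U1), P2 = 4
rho_min = min_rho_sum(h12, 3.0, 4.0, 1.0)
print(round(rho_max_1, 4), round(rho_max_2, 4), round(rho_min, 3))
```

Under these assumptions the three values agree with the 0.7024, 0.7874 and 0.144 reported in the text.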

We choose $r_i = 2$, $i = 1, \ldots, 4$ and solve the optimization problem (21) via MATLAB to get the function
$g$. The optimal solution has both component distributions in (17)-(20) the same, and these are

$f(X_1|U_1 = 0) = \mathcal{N}(-0.0002, 0.9108)$, $f(X_1|U_1 = 1) = \mathcal{N}(-0.0001, 1.0446)$,

$f(X_2|U_2 = 0) = \mathcal{N}(-0.0021, 1.1358)$, $f(X_2|U_2 = 1) = \mathcal{N}(-0.0042, 0.7283)$.

The normalized minimum distortion, defined as $\int [g_{\mathbf{p},\mathbf{a},\mathbf{c}}(x_1, x_2) - f_\rho(x_1, x_2)]^2 dx_1 dx_2 / \int f_\rho(x_1, x_2) dx_1 dx_2$,
is 0.137%.

The approximation (a cross section of the two dimensional densities) is shown in Fig. 2.
If we take $\rho = 0.6$, which violates Lemma 3, then the optimal solution from (21) is shown in Fig. 3.
We can see that the error in this case is larger. Now the normalized minimum distortion is 10.5%.
Fig. 2. Cross section of the approximation of the joint Gaussian with $\rho$ = 0.3.



C. Generalizations 

The procedure mentioned in Section IV-A can be extended to systems with general discrete alphabets,
multiple sources, lossy transmissions and side information as follows.




Fig. 3. Cross section of the approximation of the joint Gaussian with $\rho$ = 0.6.



Consider $N > 2$ users with source $i$ taking values in a discrete alphabet $\mathcal{U}_i$. In such a case, for each user
we find $P(X_i = \cdot \mid U_i = u_i)$, $u_i \in \mathcal{U}_i$, using a mapping as in (17)-(20) to yield jointly Gaussian
$(X_1, X_2, \ldots, X_N)$.

If $Z_1$ and $Z_2$ are the side information available, then we use $f(X_i = \cdot \mid U_i, Z_i)$, $i = 1, 2$ as in (17)-(20)
and obtain the optimal approximation from (21).

For lossy transmission, we choose appropriate discrete auxiliary random variables $W_i$ satisfying the
conditions in Theorem 1. Then we can form $(X_1, X_2)$ from $(W_1, W_2)$ via the optimization procedure
(21).

V. Gaussian sources over a GMAC 

In this section we consider transmission of correlated Gaussian sources over a GMAC. This is an 
important example for transmitting continuous alphabet sources over a GMAC. For example one comes 
across it if a sensor network is sampling a Gaussian random field. Also, in the application of detection 
of change ([40]) by a sensor network, it is often the detection of change in the mean of the sensor 
observations with the sensor observation noise being Gaussian. 

We will assume that $(U_{1n}, U_{2n})$ is jointly Gaussian with mean zero, variances $\sigma_i^2$, $i = 1, 2$ and correlation
$\rho$. The distortion measure will be Mean Square Error (MSE). The (relaxed) sufficient conditions from
(12)-(14) for transmission of the sources over the channel are given by (these continue to hold because
Lemmas 1-3 are still valid)



$$I(U_1; W_1 | W_2) < 0.5 \log\left[1 + \frac{P_1(1 - \tilde{\rho}^2)}{\sigma_N^2}\right],$$

$$I(U_2; W_2 | W_1) < 0.5 \log\left[1 + \frac{P_2(1 - \tilde{\rho}^2)}{\sigma_N^2}\right], \qquad (22)$$

$$I(U_1, U_2; W_1, W_2) < 0.5 \log\left[1 + \frac{P_1 + P_2 + 2\tilde{\rho}\sqrt{P_1 P_2}}{\sigma_N^2}\right],$$

where $\tilde{\rho}$ is the correlation between $(X_1, X_2)$, which are chosen to be jointly Gaussian, as in Section IV.
We consider three specific coding schemes to obtain $W_1, W_2, X_1, X_2$ where $(W_1, W_2)$ satisfy the
distortion constraints and $(X_1, X_2)$ are jointly Gaussian with an appropriate $\tilde{\rho}$ such that (22) is satisfied.
These coding schemes have been widely used. The schemes are Amplify and Forward (AF), Separation
Based (SB) and the coding scheme provided in Lapidoth and Tinguely (LT) [27]. We have compared the
performance of these schemes in [34]. AF and LT are joint source-channel coding schemes. In [27]
it is shown that AF is optimal at low SNR. In [34] we show that at high SNR LT is close to optimal. SB,
although it performs well at high SNR, is sub-optimal.
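For a candidate scheme, checking the three conditions above amounts to evaluating the right-hand sides for the chosen powers, noise variance and codeword correlation. The sketch below assumes logarithms to base 2 and the standard Gaussian MAC form of the three bounds; `rho_x` denotes the correlation between $(X_1, X_2)$ and all names are illustrative.

```python
import math

def gmac_bounds_bits(p1, p2, noise_var, rho_x):
    """Assumed right-hand sides of the three conditions, in bits."""
    r1 = 0.5 * math.log2(1.0 + p1 * (1.0 - rho_x ** 2) / noise_var)
    r2 = 0.5 * math.log2(1.0 + p2 * (1.0 - rho_x ** 2) / noise_var)
    r12 = 0.5 * math.log2(1.0 + (p1 + p2 + 2.0 * rho_x * math.sqrt(p1 * p2)) / noise_var)
    return r1, r2, r12

def conditions_hold(i1, i2, i12, p1, p2, noise_var, rho_x):
    """True if the three mutual informations are strictly below the bounds."""
    r1, r2, r12 = gmac_bounds_bits(p1, p2, noise_var, rho_x)
    return i1 < r1 and i2 < r2 and i12 < r12

# Illustrative values: P1 = 3, P2 = 4, unit noise variance, correlation 0.3.
print(gmac_bounds_bits(3.0, 4.0, 1.0, 0.3))
```

Raising `rho_x` tightens the first two bounds and loosens the third, which is the trade-off a scheme has to balance.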

For general continuous alphabet sources $(U_1, U_2)$, not necessarily Gaussian, we vector quantize $U_1^n, U_2^n$
into $\tilde{U}_1^n, \tilde{U}_2^n$. Then to obtain correlated Gaussian codewords $(X_1^n, X_2^n)$ we can use the scheme provided
in Section IV-A. Alternatively, use Slepian-Wolf coding on $(\tilde{U}_1^n, \tilde{U}_2^n)$. Then for large $n$, the encoded $\tilde{U}_1^n$ and $\tilde{U}_2^n$ are
almost independent. Now on each encoded $\tilde{U}_i^n$, $i = 1, 2$, we can use the usual independent Gaussian codebooks as in
a point to point channel.

VI. Conclusions 

In this paper, sufficient conditions are provided for transmission of correlated sources over a multiple 
access channel. Various previous results on this problem are obtained as special cases. Suitable examples 
are given to emphasize the superiority of joint source-channel coding schemes. Important special cases
of correlated discrete sources over a GMAC and Gaussian sources over a GMAC are discussed in more 
detail. In particular a new joint source-channel coding scheme is presented for discrete sources over a 
GMAC. 

Appendix I 
Proof of Theorem 1 

The coding scheme involves distributed quantization $(W_1^n, W_2^n)$ of the sources and the side information
$(U_1^n, Z_1^n)$, $(U_2^n, Z_2^n)$ followed by a correlation preserving mapping to the channel codewords. The decoding
approach involves first decoding $(W_1^n, W_2^n)$ and then obtaining the estimate $(\hat{U}_1^n, \hat{U}_2^n)$ as a function of
$(W_1^n, W_2^n)$ and the decoder side information $Z^n$.

Let $T_\epsilon^n(X, Y)$ denote the weakly $\epsilon$-typical set of sequences of length $n$ for $(X, Y)$, where $\epsilon > 0$ is an
arbitrarily small fixed positive constant. We use the following Lemmas in the proof.



Markov Lemma: Suppose $X \leftrightarrow Y \leftrightarrow Z$. If for a given $(x^n, y^n) \in T_\epsilon^n(X, Y)$, $Z^n$ is drawn according
to $\prod_{i=1}^{n} p(z_i | y_i)$, then with high probability $(x^n, y^n, Z^n) \in T_\epsilon^n(X, Y, Z)$ for $n$ sufficiently large.

The proof of this Lemma for strong typicality is given in [8]. We need it for weak typicality. By the
Markov property, $(x^n, y^n, Z^n)$ formed in the statement of the Lemma has the same joint distribution as
the original sequence $(X^n, Y^n, Z^n)$. Thus the statement of the above Lemma follows. In the same way
the following Lemma also holds.

Extended Markov Lemma: Suppose $W_1 \leftrightarrow U_1 Z_1 \leftrightarrow U_2 W_2 Z_2 Z$ and $W_2 \leftrightarrow U_2 Z_2 \leftrightarrow U_1 W_1 Z_1 Z$. If
for a given $(u_1^n, u_2^n, z_1^n, z_2^n, z^n) \in T_\epsilon^n(U_1, U_2, Z_1, Z_2, Z)$, $W_1^n$ and $W_2^n$ are drawn respectively according
to $\prod_{i=1}^{n} p(w_{1i} | u_{1i}, z_{1i})$ and $\prod_{i=1}^{n} p(w_{2i} | u_{2i}, z_{2i})$, then with high probability $(u_1^n, u_2^n, z_1^n, z_2^n, z^n, W_1^n, W_2^n) \in
T_\epsilon^n(U_1, U_2, Z_1, Z_2, Z, W_1, W_2)$ for $n$ sufficiently large.

We show the achievability of all points in the rate region (1). 

Proof: Fix $p(w_1|u_1, z_1), p(w_2|u_2, z_2), p(x_1|w_1), p(x_2|w_2)$ as well as $f_D^n(\cdot)$ satisfying the distortion
constraints. First we give the proof for the discrete channel alphabet case.

Codebook Generation: Let $R_i' = I(U_i, Z_i; W_i) + \delta$, $i \in \{1, 2\}$ for some $\delta > 0$. Generate $2^{nR_i'}$ codewords
of length $n$, sampled iid from the marginal distribution $p(w_i)$, $i \in \{1, 2\}$. For each $w_i^n$ independently
generate sequence $X_i^n$ according to $\prod_{j=1}^{n} p(x_{ij} | w_{ij})$, $i \in \{1, 2\}$. Call these sequences $x_i(w_i^n)$, $i \in \{1, 2\}$.
Reveal the codebooks to the encoders and the decoder.

Encoding: For $i \in \{1, 2\}$, given the source sequence $U_i^n$ and $Z_i^n$, the $i$th encoder looks for a codeword
$W_i^n$ such that $(U_i^n, Z_i^n, W_i^n) \in T_\epsilon^n(U_i, Z_i, W_i)$ and then transmits $x_i(W_i^n)$.

Decoding: Upon receiving $Y^n$, the decoder finds the unique $(W_1^n, W_2^n)$ pair such that
$(W_1^n, W_2^n, x_1(W_1^n), x_2(W_2^n), Y^n, Z^n) \in T_\epsilon^n$. If it fails to find such a unique pair, the decoder declares an
error and incurs a maximum distortion of $d_{max}$ (we assume that the distortion measures are bounded;
at the end we will remove this condition).

In the following we show that the probability of error for this encoding-decoding scheme tends to zero
as $n \to \infty$. The error can occur because of the following four events E1-E4. We show that $P(Ei) \to 0$
for $i = 1, 2, 3, 4$.

E1 The encoders do not find the codewords. However from rate distortion theory ([13], page 356),
$\lim_{n \to \infty} P(E1) = 0$ if $R_i' > I(U_i, Z_i; W_i)$, $i \in \{1, 2\}$.

E2 The codewords are not jointly typical with $(Y^n, Z^n)$. Probability of this event goes to zero from
the extended Markov Lemma.



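E3 There exists another codeword $\hat{w}_1^n$ such that $(\hat{w}_1^n, W_2^n, x_1(\hat{w}_1^n), x_2(W_2^n),
Y^n, Z^n) \in T_\epsilon^n$. Define $\alpha = (\hat{w}_1^n, W_2^n, x_1(\hat{w}_1^n), x_2(W_2^n), Y^n, Z^n)$. Then,

$$P(E3) = \Pr\{\text{There is } \hat{w}_1^n \ne W_1^n : \alpha \in T_\epsilon^n\} \le \sum_{\hat{w}_1^n \ne W_1^n : (\hat{w}_1^n, w_2^n, z^n) \in T_\epsilon^n} \Pr\{\alpha \in T_\epsilon^n\}. \qquad (23)$$

The probability term inside the summation in (23) is

$$\le \sum_{(x_1(\cdot), x_2(\cdot), y^n) : \alpha \in T_\epsilon^n} \Pr\{x_1(\hat{w}_1^n), x_2(w_2^n), y^n \mid \hat{w}_1^n, w_2^n, z^n\}$$

$$\le \sum_{(x_1(\cdot), x_2(\cdot), y^n) : \alpha \in T_\epsilon^n} \Pr\{x_1(\hat{w}_1^n) \mid \hat{w}_1^n\} \Pr\{x_2(w_2^n), y^n \mid w_2^n, z^n\}$$

$$\le 2^{nH(X_1, X_2, Y | W_1, W_2, Z)} \, 2^{-n\{H(X_1|W_1) + H(X_2, Y | W_2, Z) - 4\epsilon\}}.$$

But from hypothesis, we have

$$H(X_1, X_2, Y | W_1, W_2, Z) - H(X_1|W_1) - H(X_2, Y | W_2, Z)$$

$$= H(X_1|W_1) + H(X_2|W_2) + H(Y | X_1, X_2) - H(X_1|W_1) - H(X_2, Y | W_2, Z)$$

$$= H(Y | X_1, X_2) - H(Y | X_2, W_2, Z)$$

$$= H(Y | X_1, X_2, W_2, Z) - H(Y | X_2, W_2, Z) = -I(X_1; Y | X_2, W_2, Z).$$

Hence,

$$\Pr\{(\hat{w}_1^n, W_2^n, x_1(\hat{w}_1^n), x_2(W_2^n), Y^n, Z^n) \in T_\epsilon^n\} \le 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 6\epsilon\}}. \qquad (24)$$

Then from (23)

$$P(E3) \le \sum_{\hat{w}_1^n \ne W_1^n : (\hat{w}_1^n, w_2^n, z^n) \in T_\epsilon^n} 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 6\epsilon\}}$$

$$= |\{\hat{w}_1^n : (\hat{w}_1^n, w_2^n, z^n) \in T_\epsilon^n\}| \, 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 6\epsilon\}}$$

$$\le |\{\hat{w}_1^n\}| \Pr\{(\hat{w}_1^n, w_2^n, z^n) \in T_\epsilon^n\} \, 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 6\epsilon\}}$$

$$\le 2^{n\{I(U_1, Z_1; W_1) + \delta\}} \, 2^{-n\{I(W_1; W_2, Z) - 3\epsilon\}} \, 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 6\epsilon\}}$$

$$= 2^{n\{I(U_1, Z_1; W_1 | W_2, Z)\}} \, 2^{-n\{I(X_1; Y | X_2, W_2, Z) - 9\epsilon - \delta\}}. \qquad (25)$$

In (25) we have used the fact that

$$I(U_1, Z_1; W_1) - I(W_1; W_2, Z) = H(W_1 | W_2, Z) - H(W_1 | U_1, Z_1)$$

$$= H(W_1 | W_2, Z) - H(W_1 | U_1, Z_1, W_2, Z) = I(U_1, Z_1; W_1 | W_2, Z).$$

The RHS of (25) tends to zero if $I(U_1, Z_1; W_1 | W_2, Z) < I(X_1; Y | X_2, W_2, Z)$.

Similarly, by symmetry of the problem we require $I(U_2, Z_2; W_2 | W_1, Z) < I(X_2; Y | X_1, W_1, Z)$.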



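E4 There exist other codewords $\hat{w}_1^n$ and $\hat{w}_2^n$ such that $\alpha = (\hat{w}_1^n, \hat{w}_2^n, x_1(\hat{w}_1^n), x_2(\hat{w}_2^n), Y^n, Z^n) \in T_\epsilon^n$. Then,

$$P(E4) = \Pr\{\text{There is } (\hat{w}_1^n, \hat{w}_2^n) \ne (W_1^n, W_2^n) : \alpha \in T_\epsilon^n\} \le \sum_{(\hat{w}_1^n, \hat{w}_2^n) \ne (W_1^n, W_2^n) : (\hat{w}_1^n, \hat{w}_2^n, z^n) \in T_\epsilon^n} \Pr\{\alpha \in T_\epsilon^n\}. \qquad (26)$$

The probability term inside the summation in (26) is

$$\le \sum_{(x_1(\cdot), x_2(\cdot), y^n) : \alpha \in T_\epsilon^n} \Pr\{x_1(\hat{w}_1^n), x_2(\hat{w}_2^n), y^n \mid \hat{w}_1^n, \hat{w}_2^n, z^n\} \, p(\hat{w}_1^n, \hat{w}_2^n, z^n)$$

$$\le \sum_{(x_1(\cdot), x_2(\cdot), y^n) : \alpha \in T_\epsilon^n} \Pr\{x_1(\hat{w}_1^n) \mid \hat{w}_1^n\} \Pr\{x_2(\hat{w}_2^n) \mid \hat{w}_2^n\} \Pr\{y^n \mid z^n\}$$

$$\le \sum_{(x_1(\cdot), x_2(\cdot), y^n) : \alpha \in T_\epsilon^n} 2^{-n\{H(X_1|W_1) + H(X_2|W_2) + H(Y|Z) - 7\epsilon\}}$$

$$\le 2^{nH(X_1, X_2, Y | W_1, W_2, Z)} \, 2^{-n\{H(X_1|W_1) + H(X_2|W_2) + H(Y|Z) - 7\epsilon\}}.$$

But from hypothesis, we have

$$H(X_1, X_2, Y | W_1, W_2, Z) - H(X_1|W_1) - H(X_2|W_2) - H(Y|Z)$$

$$= H(Y | X_1, X_2) - H(Y|Z) = H(Y | X_1, X_2, Z) - H(Y|Z) = -I(X_1, X_2; Y | Z).$$

Hence,

$$\Pr\{(\hat{w}_1^n, \hat{w}_2^n, x_1(\hat{w}_1^n), x_2(\hat{w}_2^n), y^n, z^n) \in T_\epsilon^n\} \le 2^{-n\{I(X_1, X_2; Y | Z) - 7\epsilon\}}. \qquad (27)$$

Then from (26)

$$P(E4) \le \sum_{(\hat{w}_1^n, \hat{w}_2^n) \ne (W_1^n, W_2^n) : (\hat{w}_1^n, \hat{w}_2^n, z^n) \in T_\epsilon^n} 2^{-n\{I(X_1, X_2; Y | Z) - 7\epsilon\}}$$

$$= |\{(\hat{w}_1^n, \hat{w}_2^n) : (\hat{w}_1^n, \hat{w}_2^n, z^n) \in T_\epsilon^n\}| \, 2^{-n\{I(X_1, X_2; Y | Z) - 7\epsilon\}}$$

$$\le |\{\hat{w}_1^n\}| \, |\{\hat{w}_2^n\}| \Pr\{(\hat{w}_1^n, \hat{w}_2^n, z^n) \in T_\epsilon^n\} \, 2^{-n\{I(X_1, X_2; Y | Z) - 7\epsilon\}}$$

$$\le 2^{n\{I(U_1, Z_1; W_1) + I(U_2, Z_2; W_2) + 2\delta\}} \, 2^{-n\{I(W_1; W_2, Z) + I(W_2; W_1, Z) - I(W_1; W_2 | Z) - 4\epsilon\}} \, 2^{-n\{I(X_1, X_2; Y | Z) - 7\epsilon\}}$$

$$= 2^{n\{I(U_1, U_2, Z_1, Z_2; W_1, W_2 | Z)\}} \, 2^{-n\{I(X_1, X_2; Y | Z) - 11\epsilon - 2\delta\}}.$$

The RHS of the above inequality tends to zero if $I(U_1, U_2, Z_1, Z_2; W_1, W_2 | Z) < I(X_1, X_2; Y | Z)$.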

Thus as $n \to \infty$, with probability tending to 1, the decoder finds the correct sequence $(W_1^n, W_2^n)$ which
is jointly weakly $\epsilon$-typical with $(U_1^n, U_2^n, Z^n)$.

The fact that $(W_1^n, W_2^n)$ is weakly $\epsilon$-typical with $(U_1^n, U_2^n, Z^n)$ does not guarantee that $f_D^n(W_1^n, W_2^n, Z^n)$
will satisfy the distortions $D_1, D_2$. For this, one needs that $(W_1^n, W_2^n)$ is distortion-$\epsilon$-weakly typical ([13])
with $(U_1^n, U_2^n, Z^n)$. Let $T_{D,\epsilon}^n$ denote the set of distortion typical sequences. Then by the strong law of large
numbers $P(T_{D,\epsilon}^n | T_\epsilon^n) \to 1$ as $n \to \infty$. Thus the distortion constraints are also satisfied by $(W_1^n, W_2^n)$
obtained above with a probability tending to 1 as $n \to \infty$. Therefore, if the distortion measure $d$ is bounded,
$\lim_{n \to \infty} E[d(U_i^n, \hat{U}_i^n)] \le D_i + \epsilon$, $i = 1, 2$.

For the continuous channel alphabet case (e.g., GMAC) one also needs transmission constraints $E[g_i(X_i)] \le
\alpha_i$, $i = 1, 2$. For this we need to ensure that the coding scheme chooses a distribution $p(x_i|w_i)$ which
satisfies $E[g_i(X_i)] < \alpha_i - \epsilon$. Then if a specific codeword does not satisfy $\frac{1}{n}\sum_{k=1}^{n} g_i(x_{ik}) < \alpha_i$, one declares
an error. As $n \to \infty$ this happens with a vanishingly small probability.

If there exist $u_i^*$ such that $E[d_i(U_i, u_i^*)] < \infty$, $i = 1, 2$, then the result extends to unbounded distortion
measures also as follows. Whenever the decoded $(W_1^n, W_2^n)$ are not in the distortion typical set, we
estimate $(\hat{U}_1^n, \hat{U}_2^n)$ as $(u_1^{*n}, u_2^{*n})$. Then for $i = 1, 2$,

$$E[d(U_i^n, \hat{U}_i^n)] \le D_i + \epsilon + E[d(U_i^n, u_i^{*n}) 1_{\{(T_{D,\epsilon}^n)^c\}}]. \qquad (28)$$

Since $E[d(U_i^n, u_i^{*n})] < \infty$ and $P[(T_{D,\epsilon}^n)^c] \to 0$ as $n \to \infty$, the last term of (28) goes to zero as $n \to \infty$.





Appendix II 

Proof of converse for lossless transmission of discrete correlated sources over 

orthogonal channels with side information 

Let $P_e^n$ be the probability of error in estimating $(U_1^n, U_2^n)$ from $(Y_1^n, Y_2^n, Z^n)$. For any given coding-decoding
scheme, we will show that if $P_e^n \to 0$ then the inequalities in (10) specialized to the lossless
transmission must be satisfied for this system.

Let $\|\mathcal{U}_i\|$ be the cardinality of set $\mathcal{U}_i$. From Fano's inequality we have

$$\frac{1}{n} H(U_1^n, U_2^n | Y_1^n, Y_2^n, Z^n) \le \frac{1}{n} \log(\|\mathcal{U}_1^n\| \|\mathcal{U}_2^n\|) P_e^n + \frac{1}{n} = P_e^n (\log\|\mathcal{U}_1\| + \log\|\mathcal{U}_2\|) + \frac{1}{n}.$$

Denote $P_e^n (\log\|\mathcal{U}_1\| + \log\|\mathcal{U}_2\|) + \frac{1}{n}$ by $\lambda_n$. As $P_e^n \to 0$, $\lambda_n \to 0$.
Since

$$H(U_1^n, U_2^n | Y_1^n, Y_2^n, Z^n) = H(U_1^n | Y_1^n, Y_2^n, Z^n) + H(U_2^n | U_1^n, Y_1^n, Y_2^n, Z^n),$$

we obtain $H(U_1^n | Y_1^n, Y_2^n, Z^n)/n \le \lambda_n$. Therefore, because $\{U_{1n}\}$ is an iid sequence,

$$nH(U_1) = H(U_1^n) = H(U_1^n | Y_1^n, Y_2^n, Z^n) + I(U_1^n; Y_1^n, Y_2^n, Z^n) \le n\lambda_n + I(U_1^n; Y_1^n, Y_2^n, Z^n). \qquad (29)$$

Also, by data processing inequality,

$$I(U_1^n; Y_1^n, U_2^n, Z^n) = I(U_1^n; U_2^n, Z^n) + I(U_1^n; Y_1^n | U_2^n, Z^n) \le I(U_1^n; U_2^n, Z^n) + I(X_1^n; Y_1^n | U_2^n, Z^n). \qquad (30)$$

But,

$$I(X_1^n; Y_1^n | U_2^n, Z^n) = H(Y_1^n | U_2^n, Z^n) - H(Y_1^n | X_1^n) \le H(Y_1^n) - H(Y_1^n | X_1^n)$$

$$\le \sum_{i=1}^{n} H(Y_{1i}) - \sum_{i=1}^{n} H(Y_{1i} | X_{1i}) = \sum_{i=1}^{n} I(X_{1i}; Y_{1i}). \qquad (31)$$

The first inequality is due to the fact that conditioning reduces entropy, and the reduction of $H(Y_1^n | X_1^n)$ to a sum
of single-letter conditional entropies is due to the memoryless property of the channel.
From (29), (30) and (31),

$$H(U_1) \le \frac{1}{n} \sum_{i=1}^{n} I(U_{1i}; U_{2i}, Z_i) + \frac{1}{n} \sum_{i=1}^{n} I(X_{1i}; Y_{1i}) + \lambda_n.$$

We can introduce a time sharing random variable as done in [13] and show that $H(U_1) \le I(U_1; U_2, Z) +
I(X_1; Y_1)$. This simplifies to $H(U_1 | U_2, Z) \le I(X_1; Y_1)$.
By the symmetry of the problem we get $H(U_2 | U_1, Z) \le I(X_2; Y_2)$.
We also have 

$$nH(U_1, U_2) = H(U_1^n, U_2^n) = H(U_1^n, U_2^n | Y_1^n, Y_2^n, Z^n) + I(U_1^n, U_2^n; Y_1^n, Y_2^n, Z^n) \le I(U_1^n, U_2^n; Y_1^n, Y_2^n, Z^n) + n\lambda_n.$$

But

$$I(U_1^n, U_2^n; Y_1^n, Y_2^n, Z^n) = I(U_1^n, U_2^n; Z^n) + I(U_1^n, U_2^n; Y_1^n, Y_2^n | Z^n) \le I(U_1^n, U_2^n; Z^n) + I(X_1^n, X_2^n; Y_1^n, Y_2^n | Z^n).$$

Also,

$$I(X_1^n, X_2^n; Y_1^n, Y_2^n | Z^n) = H(Y_1^n, Y_2^n | Z^n) - H(Y_1^n, Y_2^n | X_1^n, X_2^n, Z^n)$$

$$\le H(Y_1^n, Y_2^n) - H(Y_1^n | X_1^n) - H(Y_2^n | X_2^n)$$

$$\le H(Y_1^n) + H(Y_2^n) - H(Y_1^n | X_1^n) - H(Y_2^n | X_2^n).$$

Then, following the steps used above, we obtain $H(U_1, U_2 | Z) \le I(X_1; Y_1) + I(X_2; Y_2)$. ■

Appendix III 
Proofs of Lemmas in Section 4 

Proof of Lemma 1: Let $\Delta = I(X_1; Y | X_2, U_2) - I(X_1; Y | X_2)$. Then, denoting differential entropy by
$h$,

$$\Delta = h(Y | X_2, U_2) - h(Y | X_1, X_2, U_2) - [h(Y | X_2) - h(Y | X_1, X_2)].$$

Since the channel is memoryless, $h(Y | X_1, X_2, U_2) = h(Y | X_1, X_2)$. Thus $\Delta = h(Y | X_2, U_2) - h(Y | X_2) \le 0$, because conditioning reduces entropy.
Proof of Lemma 2: Since

$$I(X_1, X_2; Y) = h(Y) - h(Y | X_1, X_2) = h(X_1 + X_2 + N) - h(N),$$

it is maximized when $h(X_1 + X_2 + N)$ is maximized. This entropy is maximized when $X_1 + X_2$ is
Gaussian with the largest possible variance $= P_1 + P_2$. If $(X_1, X_2)$ is jointly Gaussian then so is $X_1 + X_2$.
Next consider $I(X_1; Y | X_2)$. This equals

$$h(Y | X_2) - h(N) = h(X_1 + X_2 + N | X_2) - h(N) = h(X_1 + N | X_2) - h(N),$$

which is maximized when $p(x_1 | x_2)$ is Gaussian, and this happens when $X_1, X_2$ are jointly Gaussian.
A similar result holds for $I(X_2; Y | X_1)$. ■
Proof of Lemma 3: Since $X_1 \leftrightarrow U_1 \leftrightarrow U_2 \leftrightarrow X_2$ is a Markov chain, by data processing inequality
$I(X_1; X_2) \le I(U_1; U_2)$. Taking $X_1, X_2$ to be jointly Gaussian with zero mean, unit variance and correlation
$\rho$, $I(X_1; X_2) = 0.5 \log_2\left(\frac{1}{1 - \rho^2}\right)$. Hence $1 - \rho^2 \ge 2^{-2I(U_1;U_2)}$, which implies $\rho^2 \le 1 - 2^{-2I(U_1;U_2)}$. ■
Proof of Lemma 4: By the Stone-Weierstrass theorem ([25], [36]) the class of functions $(x_1, x_2) \mapsto
e^{-\frac{1}{2c_1}(x_1 - a_1)^2} e^{-\frac{1}{2c_2}(x_2 - a_2)^2}$ can be shown to be dense in $C_0$ under uniform convergence, where $C_0$ is the set
of all continuous functions on $\mathbb{R}^2$ such that $\lim_{\|x\| \to \infty} |f(x)| = 0$. Since the jointly Gaussian density
$(x_1, x_2) \mapsto e^{-\frac{x_1^2 + x_2^2 - 2\rho x_1 x_2}{2(1 - \rho^2)}}$ is in $C_0$, it can be approximated arbitrarily closely uniformly by functions of the form (16).

■

Proof of Lemma 5: Since

$$I(X_{m1}, X_{m2}; Y_m) = h(Y_m) - h(Y_m | X_{m1}, X_{m2}) = h(Y_m) - h(N),$$

it is sufficient to show that $h(Y_m) \to h(Y)$. From $(X_{m1}, X_{m2}) \to (X_1, X_2)$ in distribution and independence of
$(X_{m1}, X_{m2})$ from $N$, we get $Y_m = X_{m1} + X_{m2} + N \to X_1 + X_2 + N = Y$ in distribution. Then $f_m \to f$ uniformly implies
that $f_m(Y_m) \to f(Y)$ in distribution. Since $f_m(Y_m) > 0$, $f(Y) > 0$ a.s. and $\log$ being continuous except at 0, we obtain
$\log f_m(Y_m) \to \log f(Y)$ in distribution. Then uniform integrability provides $I(X_{m1}, X_{m2}; Y_m) \to I(X_1, X_2; Y)$. ■